Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PoC: Fix aliases in WHERE clause #115

Closed
wants to merge 1 commit into from

Conversation

Yury-Fridlyand
Copy link

@Yury-Fridlyand Yury-Fridlyand commented Sep 10, 2022

Signed-off-by: Yury-Fridlyand yuryf@bitquilltech.com

Description

Test data
opensearchsql> select int0 i0, int1 i1, int2 i2 from calcs order by i1 nulls last;
fetched rows / total rows = 17/17
+------+------+------+
| i0   | i1   | i2   |
|------+------+------|
| 8    | -9   | 6    |
| 10   | -8   | -4   |
| null | -6   | -4   |
| null | -4   | -5   |
| 1    | -3   | 5    |
| null | 2    | 0    |
| null | 3    | -6   |
| 8    | 3    | -9   |
| null | null | 5    |
| 7    | null | 3    |
| 3    | null | 2    |
| 8    | null | 9    |
| 4    | null | -3   |
| null | null | 0    |
| 4    | null | 4    |
| 11   | null | -8   |
| 4    | null | -9   |
+------+------+------+

BEFORE

opensearchsql> select int0 i0, int1 i1, int2 i2 from calcs where i1 > 0;
{'reason': 'Invalid SQL query', 'details': "can't resolve Symbol(namespace=FIELD_NAME, name=i1) in type env", 'type': 'SemanticCheckException'}
Exception
Client side error during query execution
org.opensearch.sql.exception.SemanticCheckException: can't resolve Symbol(namespace=FIELD_NAME, name=i1) in type env
        at org.opensearch.sql.analysis.TypeEnvironment.resolve(TypeEnvironment.java:55) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.ExpressionAnalyzer.visitIdentifier(ExpressionAnalyzer.java:308) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.ExpressionAnalyzer.visitQualifiedName(ExpressionAnalyzer.java:282) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.ExpressionAnalyzer.visitQualifiedName(ExpressionAnalyzer.java:72) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.ast.expression.QualifiedName.accept(QualifiedName.java:111) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.ExpressionAnalyzer.analyze(ExpressionAnalyzer.java:91) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.ExpressionAnalyzer.lambda$visitFunction$0(ExpressionAnalyzer.java:178) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195) ~[?:?]
        at java.util.Spliterators$ArraySpliterator.forEachRemaining(Spliterators.java:948) ~[?:?]
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?]
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?]
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?]
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) ~[?:?]
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:578) ~[?:?]
        at org.opensearch.sql.analysis.ExpressionAnalyzer.visitFunction(ExpressionAnalyzer.java:179) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.ExpressionAnalyzer.visitFunction(ExpressionAnalyzer.java:72) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.ast.expression.Function.accept(Function.java:36) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.ExpressionAnalyzer.analyze(ExpressionAnalyzer.java:91) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.Analyzer.visitFilter(Analyzer.java:159) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.Analyzer.visitFilter(Analyzer.java:98) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.ast.tree.Filter.accept(Filter.java:44) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.Analyzer.visitProject(Analyzer.java:277) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.Analyzer.visitProject(Analyzer.java:98) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.ast.tree.Project.accept(Project.java:71) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.analysis.Analyzer.analyze(Analyzer.java:121) ~[core-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.sql.SQLService.analyze(SQLService.java:99) ~[sql-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.legacy.plugin.RestSQLQueryAction.prepareRequest(RestSQLQueryAction.java:105) ~[legacy-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.legacy.plugin.RestSqlAction.lambda$prepareRequest$1(RestSqlAction.java:156) [legacy-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.sql.opensearch.executor.Scheduler.lambda$withCurrentContext$0(Scheduler.java:30) [opensearch-2.2.0.0-SNAPSHOT.jar:?]
        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:747) [opensearch-2.2.0.jar:2.2.0]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
        at java.lang.Thread.run(Thread.java:829) [?:?]

AFTER

opensearchsql> select int0 i0, int1 i1, int2 from calcs where i1 > 0;
fetched rows / total rows = 3/3
+------+------+--------+
| i0   | i1   | int2   |
|------+------+--------|
| null | 2    | 0      |
| null | 3    | -6     |
| 8    | 3    | -9     |
+------+------+--------+

Issues Resolved

[List any issues this PR will resolve]

Check List

  • New functionality includes testing.
    • All tests pass, including unit test, integration test and doctest
  • New functionality has been documented.
    • New functionality has javadoc added
    • New functionality has user manual doc added
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Yury-Fridlyand <yuryf@bitquilltech.com>
@Yury-Fridlyand Yury-Fridlyand changed the title Register aliases in AnalysisContext to resolve identifiers later. Fix aliases in WHERE clause Sep 10, 2022
@Bit-Quill Bit-Quill deleted a comment from codecov bot Sep 10, 2022
@codecov

This comment was marked as spam.

@Yury-Fridlyand Yury-Fridlyand changed the title Fix aliases in WHERE clause PoC: Fix aliases in WHERE clause Sep 10, 2022
@Yury-Fridlyand
Copy link
Author

https://stackoverflow.com/a/942592
You can only use column aliases in GROUP BY, ORDER BY, or HAVING clauses.

Standard SQL doesn't allow you to refer to a column alias in a WHERE clause. This restriction is imposed because when the WHERE code is executed, the column value may not yet be determined.

Copied from MySQL documentation

@MaxKsyunz
Copy link

MaxKsyunz commented Sep 10, 2022

MySQL documentation

Nice find! We can close the issue then.

I checked and SQL Server and Oracle have the same restriction. The way around it is to use a subquery:

SELECT Id, fieldSum FROM (SELECT Id,  fieldA + fieldB AS fieldSum FROM numberTable) where fieldSum > 40

@Yury-Fridlyand
Copy link
Author

Yury-Fridlyand commented Sep 10, 2022

TODOs so far:

  1. Aliases for table names including
    1.1. Joins
    1.2. Subqueries
  2. Aliases in GROUP BY
  3. Aliases in ORDER BY
  4. Aliases in HAVING

Copy link

@MitchellGale MitchellGale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this expected outcome?

opensearchsql> select int0 i0, int1 i1, int2 i2 from calcs where i1 > 0;                                                                                                           
{'reason': 'Invalid SQL query', 'details': "can't resolve Symbol(namespace=FIELD_NAME, name=i1) in type env", 'type': 'SemanticCheckException'}

@Yury-Fridlyand
Copy link
Author

Is this expected outcome?

opensearchsql> select int0 i0, int1 i1, int2 i2 from calcs where i1 > 0;                                                                                                           
{'reason': 'Invalid SQL query', 'details': "can't resolve Symbol(namespace=FIELD_NAME, name=i1) in type env", 'type': 'SemanticCheckException'}

Good point. The error message should be user friendly.

@Yury-Fridlyand
Copy link
Author

TODOs so far:

  1. Aliases for table names including
    1.1. Joins
    1.2. Subqueries
  2. Aliases in GROUP BY
  3. Aliases in ORDER BY
  4. Aliases in HAVING

Well, all these work. There are few bugs posted in SQL repo.
The approach in general is invalid. To resolve aliases we should use ParserContext::peek which returns QuerySpecification and then QuerySpecification::isSelectAlias and QuerySpecification::getSelectItemByAlias.

@Yury-Fridlyand Yury-Fridlyand deleted the dev-spike-alias branch September 12, 2022 23:49
andy-k-improving pushed a commit that referenced this pull request Nov 16, 2024
Signed-off-by: Heemin Kim <heemin@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants